Arithmetic Coding on Logarithm Domain

Authors

  • Wei Yu
  • Ping Yang
  • Yun He
Abstract

Arithmetic coding (AC) is notable for both its high compression gain and its high implementation cost. The coding process comprises range update, renormalization, and probability estimation update. Earlier AC engines realize both the range update and the probability estimation update by multiplication or table look-up. In this paper, we replace multiplications with additions by combining the original domain and the logarithmic domain. We keep the cost of switching between the two domains low by employing approximation with optional modification. Moreover, renormalization takes place only when the least probable symbol occurs, which saves much of the time spent on renormalization. Overall coding speed and implementation cost are improved compared with earlier AC engines.

Context Adaptive Binary Arithmetic Coding (CABAC) [1] is one of the two alternative entropy coding methods specified in H.264. Compared with the other technique, Context Adaptive Variable Length Coding (CAVLC), CABAC entails an access frequency increase of 25% to 30% with a bit rate reduction of up to 16% [2].

The principle of binary arithmetic coding is recursive subdivision of an interval of width $R$ [3]. Given the estimated probability $p_{LPS}$ of the least probable symbol (LPS), the interval is subdivided into two subintervals: one of width $rLPS = R \cdot p_{LPS}$, which is associated with the LPS, and the other of width $rMPS = R - rLPS$, which is assigned to the most probable symbol (MPS). Depending on whether the observed bin to be encoded is the MPS or the LPS, the corresponding subinterval is chosen as the new interval. The binary arithmetic coding process keeps updating two registers: the interval width register $R$, which holds the range of the interval, and the code register $C$, which holds the lower bound of the interval. The Q-coder (1988) [4]-[5] is a landmark among practical arithmetic coders. It applies a simple coarse approximation to calculate $R \cdot p_{LPS}$ and uses table look-up to estimate $p_{LPS}$ adaptively.
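The recursive interval subdivision described above can be sketched in a few lines. This is only an idealised illustration, not the engine of any standard: it uses exact rational arithmetic, omits renormalization and probability adaptation, and assumes that the LPS subinterval is placed above the MPS subinterval. The names `encode_bins`, `low` and `width` are hypothetical.

```python
from fractions import Fraction


def encode_bins(bins, p_lps, mps=0):
    """Idealised binary arithmetic coder (exact arithmetic, no renormalization)."""
    low = Fraction(0)    # code register C: lower bound of the interval
    width = Fraction(1)  # range register R: width of the interval
    for b in bins:
        r_lps = width * p_lps  # rLPS = R * p_LPS (the costly multiplication)
        r_mps = width - r_lps  # rMPS = R - rLPS
        if b == mps:
            width = r_mps      # MPS: keep the lower subinterval
        else:
            low += r_mps       # LPS: move C past the MPS subinterval
            width = r_lps      # and shrink R to the LPS subinterval
    return low, width


# Three MPS bins and one LPS bin with p_LPS = 1/4 (assumed, fixed probability).
low, width = encode_bins([0, 0, 1, 0], Fraction(1, 4))
print("C =", low, " R =", width)
```

Any number inside the final interval [C, C + R) identifies the coded bin sequence; the multiplication in the loop is the operation that table look-up or a logarithm-domain update aims to avoid.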
CABAC inherits the table-based mechanism of the Q-coder but achieves a better compression ratio. In CABAC, approximated multiplication results are pre-stored in a fixed table. The range value is approximated by four quantized values, obtained by partitioning the whole range $2^8 \le R < 2^9$ equally into four cells. The value of $p_{LPS}$ is approximated by 64 quantized values indexed by a 6-bit state value. With the 2 bits of the range used as the quantization index for $R$ and the 6-bit probability state value used as the quantization index for $p_{LPS}$, the approximated subinterval range rLPS is looked up from a 4×64 two-dimensional table. The probability estimation process is also table-based: the state value is updated after each bin is encoded by looking up a fixed table.

There are two problems with current AC engines. One is the high computational complexity, or the corresponding memory cost when computation is replaced with table look-up. The other is the low processing speed caused by serial coding and decoding. A natural approach is to use logarithms to convert the computationally expensive multiplications and divisions into additions and subtractions. However, this is not straightforward, because implementing AC in the logarithm domain causes a mismatch between coder and decoder due to finite precision. That is intolerable, because one error in the serial process causes a chain of subsequent errors and makes all following symbols undecodable. Another obstacle is how to estimate the probability adaptively in the logarithm domain.

The logarithm domain simplifies multiplications and divisions of the original domain, but we have to switch between the two domains whenever there is an addition or subtraction in the original domain. That is where the difficulties lie. Switches take place primarily in two cases: 1) When the LPS occurs, rMPS is added to the lower bound $C$ to form the new lower bound, and the range is updated to rLPS.
Here, to keep the bit stream decodable, the sum of rMPS and rLPS must not exceed the original interval $R$. 2) Because $p_{LPS} + p_{MPS} = 1$, it is impossible to renew both $p_{LPS}$ and $p_{MPS}$ with addition or subtraction in the logarithm domain. Generally speaking, the arithmetic coding process includes both additions/subtractions and multiplications/divisions in the original domain. The latter can be simplified by using logarithms, but at the cost of inevitable switches between the two domains.

In IBM's patents on an arithmetic codec in the logarithm domain by Mitchell et al. [6]-[7], the switching is carried out directly, by looking up dedicated log and antilog tables. They avoid overlap of the intervals in the original domain by imposing the constraint $rMPS + rLPS \le R$. As rMPS and rLPS are converted from the logarithm domain, the antilog table is designed to ensure this inequality. Adaptive probability estimation is accomplished by state transitions defined in a fixed table of finite probability states. Each state corresponds to a quantized value of the estimated $p_{LPS}$ or $p_{MPS}$. The memory cost of this implementation therefore includes at least: 1) The log and antilog tables. As the precision is 12 bits in the patent, the size of the tables is no less than 8192 bytes. 2) The state transition table. Taking the 64 states used in H.264 as an example, the next state value when the MPS or the LPS occurs is stored for each state, and the transition table takes up 128 bytes. 3) The mapping table that maps a state value to the logarithm of $p_{LPS}$ or $p_{MPS}$. This takes up 128 bytes if the logarithms of both $p_{LPS}$ and $p_{MPS}$ are stored. In total, about 10K bytes of memory are needed.

Moreover, the memory access rate is high, because: 1) For every coded LPS, the antilog table is read twice to update $C$ and $R$, and the log table is read once for controlled rounding. 2) For every coded bin, the state transition table is read once.
3) For every coded bin, the mapping table is read once to get the logarithm of the estimated probability. To sum up, both the memory cost and the access rate are high in their scheme. This makes real-time implementation difficult, especially on a PC or DSP, because memory access time accounts for most of the processing time.

We find that, to ensure that coder and decoder match, we can apply the logarithm only to calculating the renewed $R$ when the MPS occurs, and use subtraction to calculate the renewed $R$ when the LPS occurs. This guarantees that $rMPS + rLPS = R$ always holds. Fig. 1 shows a complete cycle of the coding process, which is an iteration of such cycles. All coding is carried out with 8-bit precision.

When the MPS occurs, we only need to update the range value in the logarithm domain using an addition. We write LG_x for the logarithm of x, where x may be any variable. The renewal of $R$ is given by $LG\_R \leftarrow LG\_R + LG\_P_{MPS}$.

When the LPS occurs, the work is somewhat more complicated. Suppose R1 is the previous range and R2 is the range corresponding to rMPS. The updated range value is rLPS, calculated as $R1 - R2$. First we have to obtain the values of R1 and R2 from LG_R1 and LG_R2. This is done by approximation with modification. Suppose $LG\_R1 = s1 + t1$ ($0 \le t1 < 1$) and $LG\_R2 = s2 + t2$ ($0 \le t2 < 1$), where s1 and s2 are integers. As $R1 > R2 \ge R1/2$, we obtain $s2 = s1$ or $s2 = s1 - 1$. As $2^x \approx 1 + x$ ($0 \le x \le 1$), we can directly use $t1 - t2$ (if $s2 = s1$) or $(t1 \ll 1) - t2$ (if $s2 = s1 - 1$) to calculate rLPS. Modification can be applied to achieve higher precision. For antilog computation with 8-bit precision, experimental results show that a modification table with 64 entries (indexed by the first 6 bits of t1 or t2) is enough. The compression gain of modification over no modification is small, no more than 2%. An addition is then carried out to calculate the renewed lower bound $C$.
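The LPS branch above can be illustrated with a small fixed-point sketch. It is a reading of the text under stated assumptions, not the authors' implementation: logarithms are taken to base 2 and stored with 8 fractional bits, no modification table is applied, and the implicit leading one of each mantissa ($2^{s+t} \approx 2^s (1 + t)$) is kept explicit, so the $s2 = s1 - 1$ case carries an extra constant term. The names `split`, `approx_antilog` and `r_lps` are hypothetical.

```python
FRAC_BITS = 8          # 8-bit precision, as stated in the text
ONE = 1 << FRAC_BITS   # fixed-point representation of 1.0


def split(lg):
    """Split a fixed-point log2 value into integer part s and fraction t."""
    return lg >> FRAC_BITS, lg & (ONE - 1)


def approx_antilog(lg):
    """Approximate 2**(s + t) by 2**s * (1 + t); the result is scaled by ONE."""
    s, t = split(lg)
    return (ONE + t) << s


def r_lps(lg_r1, lg_r2):
    """rLPS = R1 - R2 computed from LG_R1 and LG_R2 (result scaled by ONE)."""
    s1, t1 = split(lg_r1)
    s2, t2 = split(lg_r2)
    if s2 == s1:
        # 2**s1 * ((1 + t1) - (1 + t2)) = 2**s1 * (t1 - t2)
        return (t1 - t2) << s1
    if s2 == s1 - 1:
        # 2**s2 * (2 * (1 + t1) - (1 + t2)) = 2**s2 * (1 + (t1 << 1) - t2)
        return (ONE + (t1 << 1) - t2) << s2
    raise ValueError("expected R1 > R2 >= R1/2")


# Example values (assumed): R1 = 2**8.5 and R2 about 15% smaller.
lg_r1 = (8 << FRAC_BITS) + 128
lg_r2 = lg_r1 - 60
approx = r_lps(lg_r1, lg_r2) / ONE
exact = 2 ** (lg_r1 / ONE) - 2 ** (lg_r2 / ONE)
print(f"approximate rLPS = {approx:.2f}, exact rLPS = {exact:.2f}")
```

Only shifts, additions and subtractions are used; the gap between the approximate and exact values is what a modification table would reduce.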
Renormalization is carried out when necessary to ensure that the MSB of the updated range value is always 1. One cycle is then finished, as shown in Fig. 1: rLPS is the new range and low_new is the new lower bound. Then another cycle, consisting of a run of consecutive MPSs followed by one LPS, begins. The new $R$ has to be converted to its logarithm so that subsequent range renewals can be calculated in the logarithm domain. This conversion is also realized by approximation with modification. The approximation errors $1 + x - 2^x$ ($0 \le x \le 1$) and $\log_2(1 + y) - y$ ($0 \le y \le 1$) are very close when $x + y = 1$, and nearly identical at 8-bit precision, so we use the same modification table for conversion in both directions.

The theoretical basis for adaptive probability estimation in the logarithm domain is as follows. Assuming that the source is an independent, identically distributed binary sequence with probability $p$, we renew the probability estimate using the following formulae, where $f$ is a smoothing factor that sets the adaptation speed:

$p_{MPS}^{new} = f \cdot p_{MPS}^{old}$ if the LPS occurs,
$p_{LPS}^{new} = f \cdot p_{LPS}^{old}$ if the MPS occurs.

It can be shown that the expected value of the estimate $p_k$ (either $p_{LPS}$ or $p_{MPS}$) after $k$ events is $E(p_k) = p + (p_0 - p) \cdot f^k$, which converges to $p$ for all $0 < f < 1$. We use $f = 15/16$ for the estimation adaptation and keep updating only the logarithm of $p_{MPS}$. When the LPS occurs, because $\log_2(15/16) \approx -23/256$, the renewed $LG\_P_{MPS}$ is given by

$LG\_P_{MPS} \leftarrow LG\_P_{MPS} + 23$

(the constant 23 corresponds to $-\log_2 p_{MPS}$ stored with 8 fractional bits). When the MPS occurs, the renewed $p_{LPS}$ is given by $p_{LPS}^{new} = \frac{15}{16} \cdot p_{LPS}^{old}$. Since $0.5 \le p_{MPS} < 1$ implies $0 < LG\_P_{MPS} \le 1$, we obtain

$p_{LPS} = 1 - p_{MPS} = 1 - 2^{-LG\_P_{MPS}} \approx 1 - (1 - LG\_P_{MPS}) = LG\_P_{MPS}$.

Therefore, the renewal of $LG\_P_{MPS}$ is given by

$LG\_P_{MPS} \leftarrow LG\_P_{MPS} - (LG\_P_{MPS} \gg 4)$.

Other probability estimation techniques may of course be applied in the logarithm domain; what we present here is just one example.
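The two update rules and the convergence claim above can be checked with a short sketch. It assumes that LG_P_MPS stores $-\log_2 p_{MPS}$ scaled by $2^8$ (which is what the constant 23 suggests) and shows no MPS/LPS swap, since the text does not describe one. The names `update_lg_p_mps`, `expected_estimate` and `closed_form` are hypothetical.

```python
import math
from fractions import Fraction

# -log2(15/16) * 256 is about 23.8; the text uses the truncated value 23.
LG_F = math.floor(-math.log2(15 / 16) * 256)


def update_lg_p_mps(lg_p_mps, bin_is_mps):
    """Adapt LG_P_MPS (assumed: -log2(p_MPS) with 8 fractional bits)."""
    if bin_is_mps:
        # p_LPS <- (15/16) * p_LPS, using the approximation p_LPS ~ LG_P_MPS
        return lg_p_mps - (lg_p_mps >> 4)
    # p_MPS <- (15/16) * p_MPS becomes an addition in the logarithm domain
    return lg_p_mps + LG_F


def expected_estimate(p, p0, f, k):
    """Expected LPS estimate after k events, by iterating the update rule."""
    e = p0
    for _ in range(k):
        # MPS (probability 1 - p): estimate becomes f * e
        # LPS (probability p): p_MPS is scaled by f, so estimate is 1 - f * (1 - e)
        e = (1 - p) * (f * e) + p * (1 - f * (1 - e))
    return e


def closed_form(p, p0, f, k):
    """E(p_k) = p + (p0 - p) * f**k, the expression quoted in the text."""
    return p + (p0 - p) * f ** k


p, p0, f = Fraction(1, 5), Fraction(1, 2), Fraction(15, 16)
print("LG_F =", LG_F)
print("recurrence == closed form:",
      expected_estimate(p, p0, f, 10) == closed_form(p, p0, f, 10))
print("E(p_200) is about", float(closed_form(p, p0, f, 200)))
```

Each bin costs one addition, or one shift and one subtraction, with no state transition table; the recurrence check uses exact fractions so the comparison with the closed form is not affected by rounding.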
As a result, a 64-byte table plus additions and subtractions are enough to implement a practical arithmetic codec in the logarithm domain. This mechanism saves time and computational cost as a whole. For an encoder with rate-distortion optimization, much time can be saved in the mode decision cycles, because addition in the logarithm domain is the only computation needed to estimate the bit stream rate. Preliminary experiments show that the new AC engine is equivalent in compression efficiency to that of H.264. The corresponding hardware architecture is still under exploration.

Fig. 1. Arithmetic coding cycle.

REFERENCES
[1] M. Mrak, D. Marpe & T. Wiegand, "A context modeling algorithm and its application in video compression", presented at ICIP, Barcelona, Spain, Sept. 2003.
[2] S. Saponara, C. Blanch, K. Denolf & J. Bormans, "The JVT advanced video coding standard: Complexity and performance analysis on a tool by tool basis", in IEEE Packet Video 2003.
[3] I. H. Witten, R. M. Neal & J. G. Cleary, "Arithmetic Coding for Data Compression", Comm. ACM 30 (June 1987), 520-540.
[4] W. B. Pennebaker, J. L. Mitchell, G. G. Langdon & R. B. Arps, "An overview of the Basic principles of the Q-Coder Adaptive Binary Arithmetic Coder", IBM J. Res. Develop. 32 (Nov. 1988), 717-726.
[5] W. B. Pennebaker & J. L. Mitchell, "Optimal Hardware and Software Arithmetic Coding Procedures for the Q-Coder", IBM J. Res. Develop. 32 (Nov. 1988), 727.
[6] J. L. Mitchell, W. B. Pennebaker & G. Goertzel, US4791403, Log encoder/decoder system.
[7] J. L. Mitchell, W. B. Pennebaker & G. Goertzel, EP0225488, Log encoding/decoding method.

Publication date: 2005